Overview

Dataset statistics

Number of variables: 30
Number of observations: 897406
Missing cells: 10106416
Missing cells (%): 37.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 205.4 MiB
Average record size in memory: 240.0 B
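
The derived figures in this table can be re-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that the average record size is the total memory divided by the number of rows; both assumptions reproduce the reported values. The variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 30
n_observations = 897_406
missing_cells = 10_106_416
total_size_mib = 205.4  # reported total size in memory

# Assumed definition: missing cells (%) = missing cells / total number of cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: average record size = total bytes / rows (1 MiB = 1024 * 1024 B).
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```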

Variable types

CAT: 25
NUM: 4
BOOL: 1

Reproduction

Analysis started: 2020-07-05 20:57:27.256867
Analysis finished: 2020-07-05 20:58:57.947360
Duration: 1 minute and 30.69 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

CRASH_RECORD_ID has a high cardinality: 410559 distinct values
RD_NO has a high cardinality: 407135 distinct values
CRASH_DATE has a high cardinality: 265855 distinct values
CITY has a high cardinality: 9084 distinct values
STATE has a high cardinality: 52 distinct values
ZIPCODE has a high cardinality: 10074 distinct values
DRIVERS_LICENSE_STATE has a high cardinality: 177 distinct values
DRIVERS_LICENSE_CLASS has a high cardinality: 222 distinct values
HOSPITAL has a high cardinality: 4356 distinct values
EMS_AGENCY has a high cardinality: 5461 distinct values
EMS_RUN_NO has a high cardinality: 846 distinct values
VEHICLE_ID has 17276 (1.9%) missing values
SEAT_NO has 721616 (80.4%) missing values
CITY has 228441 (25.5%) missing values
STATE has 222539 (24.8%) missing values
ZIPCODE has 284715 (31.7%) missing values
SEX has 12438 (1.4%) missing values
AGE has 251813 (28.1%) missing values
DRIVERS_LICENSE_STATE has 355637 (39.6%) missing values
DRIVERS_LICENSE_CLASS has 422545 (47.1%) missing values
AIRBAG_DEPLOYED has 16885 (1.9%) missing values
EJECTION has 11022 (1.2%) missing values
HOSPITAL has 733805 (81.8%) missing values
EMS_AGENCY has 792989 (88.4%) missing values
EMS_RUN_NO has 881237 (98.2%) missing values
DRIVER_ACTION has 177503 (19.8%) missing values
DRIVER_VISION has 177721 (19.8%) missing values
PHYSICAL_CONDITION has 176972 (19.7%) missing values
PEDPEDAL_ACTION has 880703 (98.1%) missing values
PEDPEDAL_VISIBILITY has 880744 (98.1%) missing values
PEDPEDAL_LOCATION has 880701 (98.1%) missing values
BAC_RESULT has 176397 (19.7%) missing values
BAC_RESULT VALUE has 896257 (99.9%) missing values
CELL_PHONE_USE has 896261 (99.9%) missing values
CRASH_RECORD_ID is uniformly distributed
RD_NO is uniformly distributed
PERSON_ID has unique values

Variables

PERSON_ID
Categorical

UNIQUE

Distinct count: 897406
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
O422964 | 1 | < 0.1%
O461101 | 1 | < 0.1%
P153650 | 1 | < 0.1%
O871108 | 1 | < 0.1%
O504571 | 1 | < 0.1%
O129734 | 1 | < 0.1%
O139189 | 1 | < 0.1%
O281051 | 1 | < 0.1%
O38512 | 1 | < 0.1%
O721848 | 1 | < 0.1%
Other values (897396) | 897396 | > 99.9%

Length

Max length: 7
Median length: 7
Mean length: 6.788244117
Min length: 2

PERSON_TYPE
Categorical

Distinct count: 6
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
DRIVER | 704250 | 78.5%
PASSENGER | 175790 | 19.6%
PEDESTRIAN | 10482 | 1.2%
BICYCLE | 5974 | 0.7%
NON-MOTOR VEHICLE | 750 | 0.1%
NON-CONTACT VEHICLE | 160 | < 0.1%

Length

Max length: 19
Median length: 6
Mean length: 6.652549682
Min length: 6

CRASH_RECORD_ID
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 410559
Unique (%): 45.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
31ecf6862c691ff12d3856213b902c146b07337b42a5692e3a176a66d684d221028bb5118ef6d67a313bcaed9e97bee1855cb1f5e8650f49e8dc17663475a1ee | 61 | < 0.1%
13026c7fb51566d9ca487a093e38c6f5621c2ec25be48c306b6574983b61daeee589524b96bb2bfe66ddd0f695c8d2bf3ab0297558528e9c7a70363c763d6bd1 | 50 | < 0.1%
c727dc759107cf17b2e8141149347128bb4bc26b026c7805562206c7c5761c543dd7cc0e47fc11379455a2ecbb2847c3d1744d6feb78f276d9a457e9beeb6121 | 45 | < 0.1%
1829f52c1281a0396ef94692331b3dc530bc4be5a54cd55e94c24a5e5e49b800fbcf9f24dabe4c8277c8964ad05aadc89e90fd94021959d6dff5fad55480d595 | 45 | < 0.1%
60fbcef1942e23e9d93ba3fd5dd7f1da2c5b1e4bcc30731363a0f9ed822b5971a38d109cace85875dc2a7321d8a28ba40f24548d9ddd728f600af3b08c036a70 | 44 | < 0.1%
17afe1e4e4367dc3dcfca167fd32d6cab39e6a517863885610d5af743895769a916d20ac3094919b287724806ee9ccfba594abf57e9223a187eab3acd55b90be | 42 | < 0.1%
95741c5fca615c892ab34d22a08da8da4a04594d1b2b989ab96e7e43a59ce144152bd50793d5ca75a8590f6f30f5409c577a43b38ec9c8126bee83ff3c4367bf | 42 | < 0.1%
6b96de25ec212e980c0ab42fd01771d64737ccd9c3bed72fd26357d35f8ac00d8e3e1dfaed4c91f3159d8ed15ba4267946720443d9fb34a55b1839c5a4ae3281 | 40 | < 0.1%
61bee823dd33ba8c08b79c23806840fe0ca32f48aedc443c05b258b7d294bbef029e904d9b4b2a8eb10878d04fd6e488c018065a593216d26022c590ec70d642 | 40 | < 0.1%
b8c1315393f8d3155c03fe6967516695ca7be422ca3fa936f3301f634f8775ae8025ef081804eaf0621bba6e00df09850edd742d5a82a0072937ef3a6cbae848 | 39 | < 0.1%
Other values (410549) | 896958 | > 99.9%

Length

Max length: 128
Median length: 128
Mean length: 128
Min length: 128

RD_NO
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 407135
Unique (%): 45.7%
Missing: 7331
Missing (%): 0.8%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
JB187770 | 61 | < 0.1%
JC317468 | 50 | < 0.1%
JA283201 | 45 | < 0.1%
JB253311 | 45 | < 0.1%
HZ139917 | 44 | < 0.1%
JC525033 | 42 | < 0.1%
JB343346 | 42 | < 0.1%
JB274889 | 40 | < 0.1%
JB451258 | 40 | < 0.1%
JB257740 | 39 | < 0.1%
Other values (407125) | 889627 | 99.1%
(Missing) | 7331 | 0.8%

Length

Max length: 8
Median length: 8
Mean length: 7.959154496
Min length: 3

VEHICLE_ID
Real number (ℝ≥0)

MISSING

Distinct count: 712476
Unique (%): 81.0%
Missing: 17276
Missing (%): 1.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 426379.0620896913
Minimum: 2.0
Maximum: 856686.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.8 MiB

Quantile statistics

Minimum: 2
5-th percentile: 42048.45
Q1: 213929.25
median: 425045.5
Q3: 639390
95-th percentile: 811141.55
Maximum: 856686
Range: 856684
Interquartile range (IQR): 425460.75

Descriptive statistics

Standard deviation: 246268.9907
Coefficient of variation (CV): 0.5775822796
Kurtosis: -1.192782366
Mean: 426379.0621
Median Absolute Deviation (MAD): 212832
Skewness: 0.005795908982
Sum: 3.752690039e+11
Variance: 6.064841576e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
332155 | 60 | < 0.1%
643997 | 47 | < 0.1%
162199 | 44 | < 0.1%
366311 | 44 | < 0.1%
25920 | 43 | < 0.1%
412312 | 41 | < 0.1%
750399 | 41 | < 0.1%
464998 | 39 | < 0.1%
377323 | 39 | < 0.1%
662902 | 38 | < 0.1%
Other values (712466) | 879694 | 98.0%
(Missing) | 17276 | 1.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
856686 | 1 | < 0.1%
856680 | 1 | < 0.1%
856678 | 1 | < 0.1%
856674 | 1 | < 0.1%
856670 | 1 | < 0.1%
 

CRASH_DATE
Categorical

HIGH CARDINALITY

Distinct count: 265855
Unique (%): 29.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
11/10/2017 10:30:00 AM | 64 | < 0.1%
03/16/2018 10:17:00 AM | 61 | < 0.1%
06/22/2019 06:15:00 PM | 55 | < 0.1%
09/26/2018 07:30:00 AM | 48 | < 0.1%
05/07/2018 09:35:00 AM | 47 | < 0.1%
07/10/2018 09:10:00 AM | 45 | < 0.1%
05/28/2017 11:37:00 PM | 45 | < 0.1%
02/04/2016 09:30:00 AM | 44 | < 0.1%
11/10/2017 10:00:00 AM | 43 | < 0.1%
11/26/2019 10:50:00 AM | 42 | < 0.1%
Other values (265845) | 896912 | 99.9%

Length

Max length: 22
Median length: 22
Mean length: 22
Min length: 22

SEAT_NO
Real number (ℝ≥0)

MISSING

Distinct count: 11
Unique (%): < 0.1%
Missing: 721616
Missing (%): 80.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.202582626998122
Minimum: 1.0
Maximum: 12.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.8 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
median: 3
Q3: 6
95-th percentile: 10
Maximum: 12
Range: 11
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.181656361
Coefficient of variation (CV): 0.5191227764
Kurtosis: 3.408036626
Mean: 4.202582627
Median Absolute Deviation (MAD): 1
Skewness: 1.718990198
Sum: 738772
Variance: 4.759624479

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3 | 83082 | 9.3%
6 | 31265 | 3.5%
4 | 24570 | 2.7%
5 | 8627 | 1.0%
1 | 7294 | 0.8%
2 | 6604 | 0.7%
7 | 4449 | 0.5%
12 | 4430 | 0.5%
10 | 3798 | 0.4%
11 | 1456 | 0.2%
(Missing) | 721616 | 80.4%

Value | Count | Frequency (%)
1 | 7294 | 0.8%
2 | 6604 | 0.7%
3 | 83082 | 9.3%
4 | 24570 | 2.7%
5 | 8627 | 1.0%

Value | Count | Frequency (%)
12 | 4430 | 0.5%
11 | 1456 | 0.2%
10 | 3798 | 0.4%
8 | 215 | < 0.1%
7 | 4449 | 0.5%
 

CITY
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 9084
Unique (%): 1.4%
Missing: 228441
Missing (%): 25.5%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
CHICAGO | 470508 | 52.4%
CICERO | 6551 | 0.7%
SKOKIE | 4685 | 0.5%
EVANSTON | 3945 | 0.4%
BERWYN | 3840 | 0.4%
UNKNOWN | 3458 | 0.4%
OAK LAWN | 3253 | 0.4%
CALUMET CITY | 3253 | 0.4%
OAK PARK | 2852 | 0.3%
DES PLAINES | 2578 | 0.3%
Other values (9074) | 164042 | 18.3%
(Missing) | 228441 | 25.5%

Length

Max length: 48
Median length: 7
Mean length: 6.424887955
Min length: 1

STATE
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 52
Unique (%): < 0.1%
Missing: 222539
Missing (%): 24.8%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
IL | 638963 | 71.2%
IN | 10094 | 1.1%
XX | 4410 | 0.5%
WI | 3102 | 0.3%
MI | 2545 | 0.3%
FL | 1627 | 0.2%
TX | 1275 | 0.1%
OH | 1237 | 0.1%
CA | 1228 | 0.1%
IA | 934 | 0.1%
Other values (42) | 9452 | 1.1%
(Missing) | 222539 | 24.8%

Length

Max length: 3
Median length: 2
Mean length: 2.24798029
Min length: 2

ZIPCODE
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 10074
Unique (%): 1.6%
Missing: 284715
Missing (%): 31.7%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
60629 | 21968 | 2.4%
60639 | 18302 | 2.0%
60620 | 16323 | 1.8%
60617 | 15730 | 1.8%
60619 | 15148 | 1.7%
60623 | 14727 | 1.6%
60632 | 14618 | 1.6%
60651 | 14298 | 1.6%
60628 | 13883 | 1.5%
60641 | 12555 | 1.4%
Other values (10064) | 455139 | 50.7%
(Missing) | 284715 | 31.7%

Length

Max length: 10
Median length: 5
Mean length: 4.368441932
Min length: 1

SEX
Categorical

MISSING

Distinct count: 4
Unique (%): < 0.1%
Missing: 12438
Missing (%): 1.4%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
M | 470501 | 52.4%
F | 341353 | 38.0%
X | 65818 | 7.3%
U | 7296 | 0.8%
(Missing) | 12438 | 1.4%

Length

Max length: 3
Median length: 1
Mean length: 1.027719895
Min length: 1

AGE
Real number (ℝ)

MISSING

Distinct count: 112
Unique (%): < 0.1%
Missing: 251813
Missing (%): 28.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.07236757523703
Minimum: -49.0
Maximum: 110.0
Zeros: 7974
Zeros (%): 0.9%
Memory size: 6.8 MiB

Quantile statistics

Minimum: -49
5-th percentile: 13
Q1: 26
median: 36
Q3: 50
95-th percentile: 68
Maximum: 110
Range: 159
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 17.08623032
Coefficient of variation (CV): 0.4487829733
Kurtosis: -0.2444726847
Mean: 38.07236758
Median Absolute Deviation (MAD): 12
Skewness: 0.299186618
Sum: 24579254
Variance: 291.9392666

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
25 | 18382 | 2.0%
27 | 18319 | 2.0%
26 | 18274 | 2.0%
24 | 17879 | 2.0%
28 | 17786 | 2.0%
29 | 17177 | 1.9%
23 | 16604 | 1.9%
30 | 16018 | 1.8%
31 | 15464 | 1.7%
22 | 15174 | 1.7%
Other values (102) | 474516 | 52.9%
(Missing) | 251813 | 28.1%

Value | Count | Frequency (%)
-49 | 1 | < 0.1%
-1 | 1 | < 0.1%
0 | 7974 | 0.9%
1 | 2068 | 0.2%
2 | 1949 | 0.2%

Value | Count | Frequency (%)
110 | 3 | < 0.1%
109 | 6 | < 0.1%
108 | 3 | < 0.1%
107 | 4 | < 0.1%
106 | 1 | < 0.1%
 

DRIVERS_LICENSE_STATE
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 177
Unique (%): < 0.1%
Missing: 355637
Missing (%): 39.6%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
IL | 499168 | 55.6%
XX | 10791 | 1.2%
IN | 8816 | 1.0%
WI | 2900 | 0.3%
MI | 2415 | 0.3%
FL | 1865 | 0.2%
CA | 1380 | 0.2%
TX | 1355 | 0.2%
OH | 1172 | 0.1%
IA | 888 | 0.1%
Other values (167) | 11019 | 1.2%
(Missing) | 355637 | 39.6%

Length

Max length: 3
Median length: 2
Mean length: 2.396294431
Min length: 2

DRIVERS_LICENSE_CLASS
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 222
Unique (%): < 0.1%
Missing: 422545
Missing (%): 47.1%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
D | 413592 | 46.1%
A | 15854 | 1.8%
C | 12980 | 1.4%
B | 12723 | 1.4%
DM | 8025 | 0.9%
AM | 1901 | 0.2%
CD | 1505 | 0.2%
BM | 1136 | 0.1%
E | 951 | 0.1%
O | 794 | 0.1%
Other values (212) | 5400 | 0.6%
(Missing) | 422545 | 47.1%

Length

Max length: 3
Median length: 1
Mean length: 1.959476536
Min length: 1

SAFETY_EQUIPMENT
Categorical

Distinct count: 19
Unique (%): < 0.1%
Missing: 2520
Missing (%): 0.3%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
SAFETY BELT USED | 461481 | 51.4%
USAGE UNKNOWN | 384451 | 42.8%
NONE PRESENT | 27775 | 3.1%
CHILD RESTRAINT USED | 7360 | 0.8%
SAFETY BELT NOT USED | 5407 | 0.6%
HELMET NOT USED | 2685 | 0.3%
HELMET USED | 1332 | 0.1%
CHILD RESTRAINT - FORWARD FACING | 1146 | 0.1%
BICYCLE HELMET (PEDACYCLIST INVOLVED ONLY) | 818 | 0.1%
CHILD RESTRAINT - REAR FACING | 584 | 0.1%
Other values (9) | 1847 | 0.2%
(Missing) | 2520 | 0.3%

Length

Max length: 42
Median length: 16
Mean length: 14.672359
Min length: 3

AIRBAG_DEPLOYED
Categorical

MISSING

Distinct count: 7
Unique (%): < 0.1%
Missing: 16885
Missing (%): 1.9%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
DID NOT DEPLOY | 554673 | 61.8%
DEPLOYMENT UNKNOWN | 168052 | 18.7%
NOT APPLICABLE | 112337 | 12.5%
DEPLOYED, FRONT | 24591 | 2.7%
DEPLOYED, COMBINATION | 14138 | 1.6%
DEPLOYED, SIDE | 6397 | 0.7%
DEPLOYED OTHER (KNEE, AIR, BELT, ETC.) | 333 | < 0.1%
(Missing) | 16885 | 1.9%

Length

Max length: 38
Median length: 14
Mean length: 14.68867603
Min length: 3

EJECTION
Categorical

MISSING

Distinct count: 5
Unique (%): < 0.1%
Missing: 11022
Missing (%): 1.2%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NONE | 833584 | 92.9%
UNKNOWN | 47701 | 5.3%
TOTALLY EJECTED | 3586 | 0.4%
PARTIALLY EJECTED | 971 | 0.1%
TRAPPED/EXTRICATED | 542 | 0.1%
(Missing) | 11022 | 1.2%

Length

Max length: 18
Median length: 4
Mean length: 4.213658032
Min length: 3

INJURY_CLASSIFICATION
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 348
Missing (%): < 0.1%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NO INDICATION OF INJURY | 828655 | 92.3%
NONINCAPACITATING INJURY | 37467 | 4.2%
REPORTED, NOT EVIDENT | 23090 | 2.6%
INCAPACITATING INJURY | 7462 | 0.8%
FATAL | 384 | < 0.1%
(Missing) | 348 | < 0.1%

Length

Max length: 24
Median length: 23
Mean length: 22.95820286
Min length: 3

HOSPITAL
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 4356
Unique (%): 2.7%
Missing: 733805
Missing (%): 81.8%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
REFUSED | 55144 | 6.1%
DNA | 24612 | 2.7%
NONE | 14476 | 1.6%
99 | 6561 | 0.7%
DECLINED | 3528 | 0.4%
REFUSED EMS | 2133 | 0.2%
HOLY CROSS | 2122 | 0.2%
UNKNOWN | 1941 | 0.2%
UNIVERSITY OF CHICAGO | 1927 | 0.2%
STROGER | 1574 | 0.2%
Other values (4346) | 49583 | 5.5%
(Missing) | 733805 | 81.8%

Length

Max length: 50
Median length: 3
Mean length: 3.946278496
Min length: 1

EMS_AGENCY
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 5461
Unique (%): 5.2%
Missing: 792989
Missing (%): 88.4%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
DNA | 21757 | 2.4%
CFD | 15122 | 1.7%
REFUSED | 11087 | 1.2%
NONE | 7374 | 0.8%
99 | 6323 | 0.7%
UNK | 632 | 0.1%
UNKNOWN | 596 | 0.1%
CFD AMB | 586 | 0.1%
DECLINED | 443 | < 0.1%
REFUSED EMS | 295 | < 0.1%
Other values (5451) | 40202 | 4.5%
(Missing) | 792989 | 88.4%

Length

Max length: 36
Median length: 3
Mean length: 3.236334502
Min length: 1

EMS_RUN_NO
Categorical

HIGH CARDINALITY
MISSING

Distinct count: 846
Unique (%): 5.2%
Missing: 881237
Missing (%): 98.2%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
DNA | 3092 | 0.3%
NONE | 1251 | 0.1%
99 | 916 | 0.1%
REFUSED | 450 | 0.1%
55 | 246 | < 0.1%
24 | 219 | < 0.1%
36 | 218 | < 0.1%
10 | 217 | < 0.1%
23 | 207 | < 0.1%
70 | 195 | < 0.1%
Other values (836) | 9158 | 1.0%
(Missing) | 881237 | 98.2%

Length

Max length: 17
Median length: 3
Mean length: 2.999472925
Min length: 1

DRIVER_ACTION
Categorical

MISSING

Distinct count: 20
Unique (%): < 0.1%
Missing: 177503
Missing (%): 19.8%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NONE | 266637 | 29.7%
UNKNOWN | 163244 | 18.2%
FAILED TO YIELD | 68901 | 7.7%
OTHER | 60349 | 6.7%
FOLLOWED TOO CLOSELY | 49040 | 5.5%
IMPROPER BACKING | 23705 | 2.6%
IMPROPER LANE CHANGE | 20069 | 2.2%
IMPROPER TURN | 19426 | 2.2%
IMPROPER PASSING | 16231 | 1.8%
TOO FAST FOR CONDITIONS | 11865 | 1.3%
Other values (10) | 20436 | 2.3%
(Missing) | 177503 | 19.8%

Length

Max length: 33
Median length: 5
Mean length: 7.908088424
Min length: 3

DRIVER_VISION
Categorical

MISSING

Distinct count: 14
Unique (%): < 0.1%
Missing: 177721
Missing (%): 19.8%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NOT OBSCURED | 404141 | 45.0%
UNKNOWN | 296256 | 33.0%
OTHER | 7712 | 0.9%
MOVING VEHICLES | 4754 | 0.5%
PARKED VEHICLES | 2700 | 0.3%
WINDSHIELD (WATER/ICE) | 2379 | 0.3%
BLINDED - SUNLIGHT | 867 | 0.1%
TREES, PLANTS | 329 | < 0.1%
BUILDINGS | 298 | < 0.1%
BLINDED - HEADLIGHTS | 71 | < 0.1%
Other values (4) | 178 | < 0.1%
(Missing) | 177721 | 19.8%

Length

Max length: 22
Median length: 7
Mean length: 8.56407022
Min length: 3

PHYSICAL_CONDITION
Categorical

MISSING

Distinct count: 12
Unique (%): < 0.1%
Missing: 176972
Missing (%): 19.7%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NORMAL | 489163 | 54.5%
UNKNOWN | 217977 | 24.3%
IMPAIRED - ALCOHOL | 3506 | 0.4%
REMOVED BY EMS | 2775 | 0.3%
FATIGUED/ASLEEP | 1910 | 0.2%
OTHER | 1907 | 0.2%
EMOTIONAL | 1379 | 0.2%
ILLNESS/FAINTED | 625 | 0.1%
HAD BEEN DRINKING | 588 | 0.1%
IMPAIRED - DRUGS | 400 | < 0.1%
Other values (2) | 204 | < 0.1%
(Missing) | 176972 | 19.7%

Length

Max length: 28
Median length: 6
Mean length: 5.765636735
Min length: 3

PEDPEDAL_ACTION
Categorical

MISSING

Distinct count: 23
Unique (%): 0.1%
Missing: 880703
Missing (%): 98.1%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
CROSSING - WITH SIGNAL | 3548 | 0.4%
WITH TRAFFIC | 2843 | 0.3%
OTHER ACTION | 2204 | 0.2%
UNKNOWN/NA | 2197 | 0.2%
CROSSING - AGAINST SIGNAL | 872 | 0.1%
NO ACTION | 848 | 0.1%
NOT AT INTERSECTION | 814 | 0.1%
CROSSING - NO CONTROLS (NOT AT INTERSECTION) | 592 | 0.1%
AGAINST TRAFFIC | 575 | 0.1%
CROSSING - NO CONTROLS (AT INTERSECTION) | 476 | 0.1%
Other values (13) | 1734 | 0.2%
(Missing) | 880703 | 98.1%

Length

Max length: 49
Median length: 3
Mean length: 3.276703075
Min length: 3

PEDPEDAL_VISIBILITY
Categorical

MISSING

Distinct count: 4
Unique (%): < 0.1%
Missing: 880744
Missing (%): 98.1%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
NO CONTRASTING CLOTHING | 13194 | 1.5%
CONTRASTING CLOTHING | 2239 | 0.2%
OTHER LIGHT SOURCE USED | 816 | 0.1%
REFLECTIVE MATERIAL | 413 | < 0.1%
(Missing) | 880744 | 98.1%

Length

Max length: 23
Median length: 3
Mean length: 3.362011174
Min length: 3

PEDPEDAL_LOCATION
Categorical

MISSING

Distinct count: 8
Unique (%): < 0.1%
Missing: 880701
Missing (%): 98.1%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
IN ROADWAY | 7732 | 0.9%
IN CROSSWALK | 5561 | 0.6%
UNKNOWN/NA | 1325 | 0.1%
BIKEWAY | 841 | 0.1%
NOT IN ROADWAY | 733 | 0.1%
DRIVEWAY ACCESS | 244 | < 0.1%
BIKE LANE | 229 | < 0.1%
SHOULDER | 40 | < 0.1%
(Missing) | 880701 | 98.1%

Length

Max length: 15
Median length: 3
Mean length: 3.144167746
Min length: 3

BAC_RESULT
Categorical

MISSING

Distinct count: 4
Unique (%): < 0.1%
Missing: 176397
Missing (%): 19.7%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
TEST NOT OFFERED | 710270 | 79.1%
TEST REFUSED | 7651 | 0.9%
TEST PERFORMED, RESULTS UNKNOWN | 1799 | 0.2%
TEST TAKEN | 1289 | 0.1%
(Missing) | 176397 | 19.7%

Length

Max length: 31
Median length: 16
Mean length: 13.43202742
Min length: 3

BAC_RESULT VALUE
Real number (ℝ≥0)

MISSING

Distinct count: 50
Unique (%): 4.4%
Missing: 896257
Missing (%): 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16929503916449085
Minimum: 0.0
Maximum: 0.99
Zeros: 103
Zeros (%): < 0.1%
Memory size: 6.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.12
median: 0.17
Q3: 0.21
95-th percentile: 0.3
Maximum: 0.99
Range: 0.99
Interquartile range (IQR): 0.09

Descriptive statistics

Standard deviation: 0.09811609147
Coefficient of variation (CV): 0.5795568019
Kurtosis: 12.80452339
Mean: 0.1692950392
Median Absolute Deviation (MAD): 0.05
Skewness: 1.780141616
Sum: 194.52
Variance: 0.009626767406

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 103 | < 0.1%
0.17 | 83 | < 0.1%
0.18 | 82 | < 0.1%
0.21 | 74 | < 0.1%
0.14 | 69 | < 0.1%
0.2 | 60 | < 0.1%
0.19 | 59 | < 0.1%
0.15 | 55 | < 0.1%
0.22 | 50 | < 0.1%
0.16 | 48 | < 0.1%
Other values (40) | 466 | 0.1%
(Missing) | 896257 | 99.9%

Value | Count | Frequency (%)
0 | 103 | < 0.1%
0.01 | 3 | < 0.1%
0.02 | 4 | < 0.1%
0.03 | 14 | < 0.1%
0.04 | 10 | < 0.1%

Value | Count | Frequency (%)
0.99 | 1 | < 0.1%
0.95 | 1 | < 0.1%
0.88 | 1 | < 0.1%
0.8 | 1 | < 0.1%
0.79 | 1 | < 0.1%
 

CELL_PHONE_USE
Boolean

MISSING

Distinct count: 2
Unique (%): 0.2%
Missing: 896261
Missing (%): 99.9%
Memory size: 6.8 MiB

Value | Count | Frequency (%)
Y | 746 | 0.1%
N | 399 | < 0.1%
(Missing) | 896261 | 99.9%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
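
As a minimal sketch of this calculation (not the report's own implementation), the function below divides the covariance by the product of the standard deviations; the common 1/n factors cancel, so only sums of deviations are needed. The function name and the sample data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the two standard deviations cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical sample data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Shifting and rescaling x (location and scale change) leaves r unchanged.
r_rescaled = pearson_r([10 * v + 3 for v in x], y)
print(round(r, 4), round(r_rescaled, 4))
```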

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
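
A minimal sketch of this recipe, assuming tied values receive the average of the ranks they span (a common convention that the text does not specify): rank both variables, then apply the Pearson formula to the ranks. All names and data are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 4))
```

On this monotonic but nonlinear example ρ reaches 1 while Pearson's r stays below 1, which is the difference the paragraph above describes.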

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
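
The pair-counting definition can be sketched literally as follows. This is an illustration of the formula as stated, in which tied pairs count as neither concordant nor discordant; library implementations may apply a tie correction that this sketch omits. The function name and data are assumptions.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction (concordant).
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical data: 10 pairs, of which 7 are concordant and 3 discordant.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau(x, y)
print(tau)
```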

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available in the Phik project's documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
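
The following is a sketch of a bias-corrected Cramér's V in the form usually attributed to Bergsma (2013), computed from a contingency table of counts. It is an illustration rather than the report's code; it assumes every row and column total is non-zero, and the function name and example tables are hypothetical.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical tables: one with independent variables, one perfectly associated.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```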

Missing values

Sample

First rows

(index) | PERSON_ID | PERSON_TYPE | CRASH_RECORD_ID | RD_NO | VEHICLE_ID | CRASH_DATE | SEAT_NO | CITY | STATE | ZIPCODE | SEX | AGE | DRIVERS_LICENSE_STATE | DRIVERS_LICENSE_CLASS | SAFETY_EQUIPMENT | AIRBAG_DEPLOYED | EJECTION | INJURY_CLASSIFICATION | HOSPITAL | EMS_AGENCY | EMS_RUN_NO | DRIVER_ACTION | DRIVER_VISION | PHYSICAL_CONDITION | PEDPEDAL_ACTION | PEDPEDAL_VISIBILITY | PEDPEDAL_LOCATION | BAC_RESULT | BAC_RESULT VALUE | CELL_PHONE_USE
0 | O10 | DRIVER | 2e31858c0e411f0bdcb337fb7c415aa93763cf2f23e02fb278e2ed4cf855d36c5a83f270e6f2e0e234ec48b4a0e55a2c51e37b0d40677392c25079c51727a57f | HY368708 | 10.0 | 08/04/2015 12:40:00 PM | NaN | CHICAGO | IL | 60641 | M | NaN | NaN | NaN | USAGE UNKNOWN | NOT APPLICABLE | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | FAILED TO YIELD | UNKNOWN | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
1 | O100 | DRIVER | e73b35bd7651b0c6693162bee0666db159b2890143700936741e9ad9cfc1b0d7f71c0f63dc0594ecdd6fd2183fa2dcd67dfa76a4799839a00a81bc45e3966bb1 | HY374018 | 96.0 | 07/31/2015 05:50:00 PM | NaN | ELK GROVE | IL | 60007 | M | NaN | NaN | NaN | SAFETY BELT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | FOLLOWED TOO CLOSELY | UNKNOWN | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
2 | O1000 | DRIVER | f2b1adeb85a15112e4fb7db74bff440d6ca53ff7a21e10490081685db85460a53c0f3d52dc0e181e69348053ef3198dccbc24e01df9179e42331bc4345e6f28d | HY407431 | 954.0 | 09/02/2015 11:45:00 AM | NaN | CHICAGO | IL | NaN | M | 31.0 | IL | D | USAGE UNKNOWN | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | UNKNOWN | UNKNOWN | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
3 | O10000 | DRIVER | 15a3e24fce3ce7cd2b02d44013d1a93ff2fbdca80632ecca87150673b21e819bd8649e309f867728fd25a9a9fcb6f0f2c00c42ab6733e4d6945262dbe6c90c0e | HY484148 | 9561.0 | 10/31/2015 09:30:00 PM | NaN | SKOKIE | IL | 60076 | M | 29.0 | IL | D | SAFETY BELT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NONE | NOT OBSCURED | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
4 | O100001 | DRIVER | 2fcefeab458932d8b1b12e103c18c50adc659943cccd4b17fc22124a16b1ba9df582f703cd07e7ba892d00b904c9a9849b7c0b6a39063c00e829a752fe22f206 | HZ525619 | 96762.0 | 11/15/2016 05:45:00 PM | NaN | NaN | NaN | NaN | X | NaN | NaN | NaN | USAGE UNKNOWN | DEPLOYMENT UNKNOWN | UNKNOWN | NO INDICATION OF INJURY | NaN | NaN | NaN | UNKNOWN | UNKNOWN | UNKNOWN | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
5 | O100002 | DRIVER | 2fcefeab458932d8b1b12e103c18c50adc659943cccd4b17fc22124a16b1ba9df582f703cd07e7ba892d00b904c9a9849b7c0b6a39063c00e829a752fe22f206 | HZ525619 | 96754.0 | 11/15/2016 05:45:00 PM | NaN | CHICAGO | IL | 60619 | F | 63.0 | NaN | NaN | SAFETY BELT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NONE | NOT OBSCURED | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
6 | O100003 | DRIVER | c2f21fd14725cf30d43ae59057e59b2e225c9e62a846e7995a6fbf041a23f4578ac02ec12570a1451580b566e134a1aeccceabfc653a1b9a8ca87d122d2bd3e8 | HZ525629 | 96757.0 | 11/22/2016 01:45:00 PM | NaN | PARKRIDGE | IL | NaN | F | 20.0 | IL | D | USAGE UNKNOWN | DEPLOYMENT UNKNOWN | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | FOLLOWED TOO CLOSELY | NOT OBSCURED | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
7 | O100004 | DRIVER | c2f21fd14725cf30d43ae59057e59b2e225c9e62a846e7995a6fbf041a23f4578ac02ec12570a1451580b566e134a1aeccceabfc653a1b9a8ca87d122d2bd3e8 | HZ525629 | 96755.0 | 11/22/2016 01:45:00 PM | NaN | CHICAGO | IL | 60646 | M | 18.0 | IL | D | SAFETY BELT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | REFUSED | NaN | NaN | NONE | NOT OBSCURED | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
8 | O100005 | DRIVER | 5bfddf5ec81892bf71c2ba0f807c1ec112fd7ecdecbf3c6abbcea4314d7333fc6a4a058537ee8e97dbbb73019444b810f8aabe493003563d25736a5bd2c278a9 | HZ525606 | 96759.0 | 11/19/2016 06:35:00 PM | NaN | BUFFALO GROVE | IL | 60089 | F | 30.0 | IL | NaN | USAGE UNKNOWN | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | IMPROPER LANE CHANGE | UNKNOWN | UNKNOWN | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN
9 | O100006 | DRIVER | 5bfddf5ec81892bf71c2ba0f807c1ec112fd7ecdecbf3c6abbcea4314d7333fc6a4a058537ee8e97dbbb73019444b810f8aabe493003563d25736a5bd2c278a9 | HZ525606 | 96760.0 | 11/19/2016 06:35:00 PM | NaN | CHICAGO | IL | 60625 | M | 50.0 | IL | D | SAFETY BELT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NONE | NOT OBSCURED | NORMAL | NaN | NaN | NaN | TEST NOT OFFERED | NaN | NaN

Last rows

(index) | PERSON_ID | PERSON_TYPE | CRASH_RECORD_ID | RD_NO | VEHICLE_ID | CRASH_DATE | SEAT_NO | CITY | STATE | ZIPCODE | SEX | AGE | DRIVERS_LICENSE_STATE | DRIVERS_LICENSE_CLASS | SAFETY_EQUIPMENT | AIRBAG_DEPLOYED | EJECTION | INJURY_CLASSIFICATION | HOSPITAL | EMS_AGENCY | EMS_RUN_NO | DRIVER_ACTION | DRIVER_VISION | PHYSICAL_CONDITION | PEDPEDAL_ACTION | PEDPEDAL_VISIBILITY | PEDPEDAL_LOCATION | BAC_RESULT | BAC_RESULT VALUE | CELL_PHONE_USE
897396 | P204385 | PASSENGER | 93ae9608099cc936377a3c103955600be620c8361e59a1d79a686981683c2cf58425710634c96c17c43d69623005cfdcc4876a010c3090b66541b13318f24c05 | JD254733 | 848712.0 | 06/05/2020 02:45:00 PM | 7.0 | CHICAGO | IL | NaN | F | 25.0 | NaN | NaN | SAFETY BELT NOT USED | DID NOT DEPLOY | NONE | NONINCAPACITATING INJURY | JACKSON PARK | CFD | 55 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897397 | P204386 | PASSENGER | 93ae9608099cc936377a3c103955600be620c8361e59a1d79a686981683c2cf58425710634c96c17c43d69623005cfdcc4876a010c3090b66541b13318f24c05 | JD254733 | 848712.0 | 06/05/2020 02:45:00 PM | 7.0 | CHICAGO | IL | 60649 | F | 68.0 | NaN | NaN | SAFETY BELT NOT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897398 | P204387 | PASSENGER | 93ae9608099cc936377a3c103955600be620c8361e59a1d79a686981683c2cf58425710634c96c17c43d69623005cfdcc4876a010c3090b66541b13318f24c05 | JD254733 | 848712.0 | 06/05/2020 02:45:00 PM | 7.0 | CHICAGO | IL | 60637 | F | 53.0 | NaN | NaN | SAFETY BELT NOT USED | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897399 | P204389 | PASSENGER | 93ae9608099cc936377a3c103955600be620c8361e59a1d79a686981683c2cf58425710634c96c17c43d69623005cfdcc4876a010c3090b66541b13318f24c05 | JD254733 | 848712.0 | 06/05/2020 02:45:00 PM | 7.0 | CHICAGO | IL | 60637 | F | 39.0 | NaN | NaN | SAFETY BELT NOT USED | NOT APPLICABLE | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897400 | P204390 | PASSENGER | 93ae9608099cc936377a3c103955600be620c8361e59a1d79a686981683c2cf58425710634c96c17c43d69623005cfdcc4876a010c3090b66541b13318f24c05 | JD254733 | 848712.0 | 06/05/2020 02:45:00 PM | 7.0 | CHICAGO | IL | 60637 | F | 17.0 | NaN | NaN | SAFETY BELT NOT USED | NOT APPLICABLE | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897401 | P204448 | PASSENGER | 0abfcb63aff48dbc1c1b628ea327e320f315f557c9f7d9793eecf1c26e89f866c361c03eb7b63ec22c4de39d03589c9fc850e10ba64042fcb74dcb3943d4f1aa | JD254459 | 849068.0 | 06/05/2020 10:57:00 AM | 6.0 | CHICAGO | IL | 60644 | F | 55.0 | NaN | NaN | SAFETY BELT USED | DID NOT DEPLOY | NONE | REPORTED, NOT EVIDENT | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897402 | P204454 | PASSENGER | 02b88fc435881e3ede873b59dcca638305a953f0e29b64e53d09901c59d012ce396988e41e4a4f4e612c3b0a5006db338d8315f3be1307b436eb708bfa967a9d | JD254908 | 848444.0 | 06/05/2020 05:00:00 PM | 3.0 | NaN | IL | NaN | X | NaN | NaN | NaN | USAGE UNKNOWN | NOT APPLICABLE | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897403 | P204866 | PASSENGER | 4d4315d6432fab513f270f7cd9e60bd5b1b03be7524e38e7de0231505d76ced4e886bb7cc13aec0365c85f772f93b18fe96863d7f4ef30000f2856e46bae8b25 | JD259079 | 850812.0 | 06/05/2020 12:20:00 PM | 4.0 | NaN | NaN | NaN | M | 16.0 | NaN | NaN | SAFETY BELT USED | NOT APPLICABLE | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897404 | P204880 | PASSENGER | 6e0932de2a9bc3969bd708b25b233496a36fb0892232ef3a416a9aab4ea90e9d117fbb0b89c880fd37024f2dee65a92f138998914db69219f409d63ca123f520 | JD259879 | 850898.0 | 06/05/2020 07:00:00 PM | 3.0 | NaN | NaN | NaN | F | NaN | NaN | NaN | USAGE UNKNOWN | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
897405 | P204881 | PASSENGER | 6e0932de2a9bc3969bd708b25b233496a36fb0892232ef3a416a9aab4ea90e9d117fbb0b89c880fd37024f2dee65a92f138998914db69219f409d63ca123f520 | JD259879 | 850898.0 | 06/05/2020 07:00:00 PM | 6.0 | CHICAGO | IL | 60637 | M | 5.0 | NaN | NaN | USAGE UNKNOWN | DID NOT DEPLOY | NONE | NO INDICATION OF INJURY | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN